Crate datafusion_expr
DataFusion is an extensible query execution framework that uses Apache Arrow as its in-memory format.
This crate is a submodule of DataFusion that provides types representing logical query plans (LogicalPlan) and logical expressions (Expr) as well as utilities for working with these types.
The expr_fn module contains functions for creating expressions.
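As a rough illustration of the builder-function style, the sketch below composes a small expression tree from helper functions. The `Expr` enum, the `col`, `lit`, `gt`, and `and` helpers, and the `render` function are simplified stand-ins written for this example. They are assumptions for illustration, not the crate's actual definitions or signatures.

```rust
// Simplified stand-in for a logical expression tree (NOT the crate's Expr type).
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Column(String),
    Literal(i64),
    Gt(Box<Expr>, Box<Expr>),
    And(Box<Expr>, Box<Expr>),
}

// Hypothetical helper constructors, in the spirit of an expression-function module.
fn col(name: &str) -> Expr {
    Expr::Column(name.to_string())
}

fn lit(value: i64) -> Expr {
    Expr::Literal(value)
}

impl Expr {
    // Build `self > other`.
    fn gt(self, other: Expr) -> Expr {
        Expr::Gt(Box::new(self), Box::new(other))
    }

    // Build `self AND other`.
    fn and(self, other: Expr) -> Expr {
        Expr::And(Box::new(self), Box::new(other))
    }
}

// Render the tree as SQL-like text by walking it recursively.
fn render(expr: &Expr) -> String {
    match expr {
        Expr::Column(name) => name.clone(),
        Expr::Literal(v) => v.to_string(),
        Expr::Gt(l, r) => format!("{} > {}", render(l), render(r)),
        Expr::And(l, r) => format!("({}) AND ({})", render(l), render(r)),
    }
}
```

With these stand-ins, `col("a").gt(lit(5))` builds the tree for `a > 5` without any SQL parsing. This is the general idea behind constructing expressions programmatically.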
Re-exports
pub use aggregate_function::AggregateFunction;
pub use expr::Between;
pub use expr::BinaryExpr;
pub use expr::Case;
pub use expr::Cast;
pub use expr::Expr;
pub use expr::GetIndexedField;
pub use expr::GroupingSet;
pub use expr::Like;
pub use expr::TryCast;
pub use expr_schema::ExprSchemable;
pub use function::AccumulatorFunctionImplementation;
pub use function::ReturnTypeFunction;
pub use function::ScalarFunctionImplementation;
pub use function::StateTypeFunction;
pub use logical_plan::builder::build_join_schema;
pub use logical_plan::builder::union;
pub use logical_plan::builder::wrap_projection_for_join_if_necessary;
pub use logical_plan::builder::UNNAMED_TABLE;
pub use logical_plan::LogicalPlanBuilder;
pub use window_frame::WindowFrame;
pub use window_frame::WindowFrameBound;
pub use window_frame::WindowFrameUnits;
pub use window_function::BuiltInWindowFunction;
pub use window_function::WindowFunction;
pub use expr_fn::*;
Modules
Aggregate function module contains all built-in aggregate function definitions
Expr module contains core type definition for Expr.
Functions for creating logical expressions
Expression rewriter
Expression visitor
Utility functions for complex field access
Function module contains typing and signature for built-in and user defined functions.
Type coercion rules for DataFusion
Expression utilities
Window frame module
Window functions provide the ability to perform calculations across
sets of rows that are related to the current query row.
Structs
Aggregates its input based on a set of grouping and aggregate
expressions (e.g. SUM).
Logical representation of a user-defined aggregate function (UDAF)
A UDAF is different from a UDF in that it is stateful across batches.
Creates a catalog (aka “Database”).
Creates a schema.
Creates an external table.
Creates an in memory table.
Creates a view.
Apply Cross Join to two logical plans
Describes the schema of a table
Removes duplicate rows from the input
The operator that modifies the content of a database (adapted from substrait WriteRel)
Drops a table.
Drops a view.
Produces no rows: An empty relation with an empty schema
Produces a relation with string representations of
various parts of the plan
Extension operator defined outside of DataFusion
Filters rows from its input that do not match an
expression (essentially a WHERE clause with a predicate
expression).
Join two logical plans on one or more join columns
Produces the first n tuples from its input and discards the rest.
Evaluates an arbitrary list of expressions (essentially a
SELECT with an expression list) on its input.
Repartition the plan based on a partitioning scheme.
Logical representation of a UDF.
Set a Variable’s value – value in ConfigOptions
The Signature of a function defines its supported input types as well as its volatility.
Sorts its input according to a list of sort expressions.
Represents some sort of execution plan, in String form
Subquery
Aliased subquery
Produces rows from a table provider by reference or from the context
Union multiple inputs
Unnest a column that contains a nested list type.
Values expression. See Postgres VALUES documentation for more details.
Windows its input based on a set of window specs and window functions (e.g. SUM or RANK)
Enums
Enum of all built-in scalar functions
Represents the result of evaluating an expression: either a single
ScalarValue or an ArrayRef.
Join constraint
Join type
A LogicalPlan represents the different types of relational
operators (such as Projection, Filter, etc) and can be created by
the SQL query planner and the DataFrame API.
Operators applied to expressions
Logical partitioning schemes supported by the repartition operator.
Represents which type of plan, when storing multiple
for use in EXPLAIN plans
Table source: indicates whether and how a filter expression can be handled by a
TableProvider for table scans.
Indicates the type of this table for metadata/catalog purposes.
A function’s type signature, which defines the function’s supported argument types.
A function’s volatility, which defines the function’s eligibility for certain optimizations
Statics
Currently supported types by the nullif function.
The order of these types corresponds to the order in which coercion applies.
This should thus be from least informative to most informative
Traits
An accumulator represents a stateful object that lives throughout the evaluation of multiple rows and
generically accumulates values.
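To make the idea of state that survives across batches concrete, here is a minimal sketch of a sum accumulator. The `SketchAccumulator` trait, its method names, and the use of plain `i64` slices as batches are assumptions made for this example. The crate's own accumulator trait works with Arrow arrays and has a different signature.

```rust
// Hypothetical, simplified accumulator interface (NOT the crate's trait).
trait SketchAccumulator {
    // Fold one batch of input values into the running state.
    fn update_batch(&mut self, values: &[i64]);
    // Combine a partial state produced by another accumulator.
    fn merge(&mut self, other_state: i64);
    // Expose the intermediate state so it can be merged elsewhere.
    fn state(&self) -> i64;
    // Produce the final aggregate value.
    fn evaluate(&self) -> i64;
}

// A running sum: the only state kept between batches is the total so far.
struct SumAccumulator {
    sum: i64,
}

impl SketchAccumulator for SumAccumulator {
    fn update_batch(&mut self, values: &[i64]) {
        self.sum += values.iter().sum::<i64>();
    }

    fn merge(&mut self, other_state: i64) {
        self.sum += other_state;
    }

    fn state(&self) -> i64 {
        self.sum
    }

    fn evaluate(&self) -> i64 {
        self.sum
    }
}

// Feed several batches through one accumulator and return the final value.
fn sum_over_batches(batches: &[Vec<i64>]) -> i64 {
    let mut acc = SumAccumulator { sum: 0 };
    for batch in batches {
        acc.update_batch(batch);
    }
    acc.evaluate()
}
```

The same object is updated once per batch rather than once per query, which is what distinguishes a stateful aggregate from a stateless scalar function.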
Trait that implements the Visitor pattern for a depth first walk of
LogicalPlan nodes. pre_visit is called before any children are visited,
and then post_visit is called after all children have been visited.
To use, define a struct that implements this trait and then invoke
LogicalPlan::accept.
The TableSource trait is used during logical query planning and optimizations and
provides access to schema information and filter push-down capabilities. This trait
provides a subset of the functionality of the TableProvider trait in the core
datafusion crate. The TableProvider trait provides additional capabilities needed for
physical query execution (such as the ability to perform a scan). The reason for
having two separate traits is to avoid having the logical plan code be dependent
on the DataFusion execution engine. Other projects may want to use DataFusion’s
logical plans and have their own execution engine.
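The pre_visit/post_visit ordering described for the plan visitor trait above can be sketched with a toy plan tree. Everything here (the `Plan` enum, the `PlanVisitor` trait, `TraceVisitor`, and this `accept` method) is a simplified stand-in invented for illustration. It omits error handling and any early-termination mechanism that the crate's own visitor trait may have.

```rust
// Toy plan tree (NOT the crate's LogicalPlan).
enum Plan {
    Scan(String),
    Filter(Box<Plan>),
    Join(Box<Plan>, Box<Plan>),
}

// Hypothetical visitor interface with the two hooks described above.
trait PlanVisitor {
    fn pre_visit(&mut self, plan: &Plan);
    fn post_visit(&mut self, plan: &Plan);
}

impl Plan {
    // Short label used when tracing the walk.
    fn name(&self) -> String {
        match self {
            Plan::Scan(table) => format!("Scan({})", table),
            Plan::Filter(_) => "Filter".to_string(),
            Plan::Join(_, _) => "Join".to_string(),
        }
    }

    // Direct inputs of this node, left to right.
    fn children(&self) -> Vec<&Plan> {
        match self {
            Plan::Scan(_) => vec![],
            Plan::Filter(input) => vec![&**input],
            Plan::Join(left, right) => vec![&**left, &**right],
        }
    }

    // Depth-first walk: pre_visit, then all children, then post_visit.
    fn accept<V: PlanVisitor>(&self, visitor: &mut V) {
        visitor.pre_visit(self);
        for child in self.children() {
            child.accept(visitor);
        }
        visitor.post_visit(self);
    }
}

// A visitor that records the order in which the hooks fire.
struct TraceVisitor {
    events: Vec<String>,
}

impl PlanVisitor for TraceVisitor {
    fn pre_visit(&mut self, plan: &Plan) {
        self.events.push(format!("pre {}", plan.name()));
    }

    fn post_visit(&mut self, plan: &Plan) {
        self.events.push(format!("post {}", plan.name()));
    }
}
```

Calling `accept` on the root with a `TraceVisitor` records each node's "pre" event before any of its descendants and its "post" event after all of them.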
Trait for converting a type to a literal timestamp
Trait for something that can be formatted as a stringified plan
This defines the interface for LogicalPlan nodes that can be
used to extend DataFusion with custom relational operators.
Functions
Create a literal expression
Create a literal timestamp expression